ABSTRACT

The stability of reading performance, as measured by
the Metropolitan Achievement Tests, Iowa Tests of Basic Skills, and
Iowa Tests of Educational Development, was studied using students in
grades 1 through 7 and grades 9 and 11. A reading vocabulary test and
a reading comprehension test are included in all three test
batteries. The standard scores on the three tests were pooled to
obtain a composite reading score for three independent samples of
students. Sample I consisted of grades 3-6 and 9 and 11, the number
of students varying from a low of 71 (grade 5) to a high of 1,116
(grade 9); Sample II was made up of students from grades 1-7 and
grade 9, the number varying from 520 (grade 2) to 1,240 (grade 7);
and Sample III contained students from grades 1-6, varying in number
from 1,095 (grade 6) to 1,320 (grade 1). Results of the study showed
that substantial long-term stability was reflected in both the
vocabulary and comprehension tests; grade 1 scores correlated above
.5 with all subsequent measures. By the end of the primary grades,
students' scores correlated above .70 with all subsequent measures.
When the coefficients were corrected for attenuation to allow an
estimate of the relationships after errors of measurement on the test
were removed, the values were about .10 higher. It is concluded that
early performance in reading does not represent temporary maturational
status for most pupils, but has a substantial relationship with terminal
achievement levels in both reading vocabulary and comprehension.
Is early success in reading related to long-term reading competency?

Are differences in initial reading success the result of age and maturational
differences that dissipate with time? Are "slow starters" just immature 
pupils who eventually achieve normally or do they continue to have 
"ignition" trouble? Although there has been considerable research on 
the question of IQ constancy, it is surprising that so little attention 
has been devoted to the related issue of achievement stability, a topic 
of much greater educational and social importance. 

Bloom's survey of research on the predictability of achievement
data failed to locate any research that studied even short-term
consequences of grade one reading performance. The published studies on the
topic of achievement constancy in the area of reading revealed by the
literature search are summarized in Table 1. The information in Table 1
reveals that most studies have utilized small N's (which yield unreliable
stability estimates) and that all published studies have extended over a
five-year interval or less.
Table 1

Summary of Studies on the Stability of Reading Performance.

TEST                           N       GRADES       r
Metropolitan Reading           105     2-5          .76
Stanford Reading               47      2.9 - 6.9    .67
Stanford Paragraph Meaning     81      5-6          .77
ITBS Reading                   27      6.9 - 8.9    .75
Cooperative Reading                    7-12         .77
  Comprehension                        8-12         .76
                                       9-12         .82
                                       10-12        .82
Nelson Denny Reading           517     13-14        .83
ITBS and ITED                  256     7-9          .83
                               251     7-11         .79
ITBS Reading                   900-    3-8          .77
ITBS Vocabulary                                     .76
ITBS Reading                   9972    5-8          .79
ITBS Vocabulary                                     .83

INVESTIGATOR (in the order listed): Townsend (1944); Hildredth (1916);
Kvaraceus and Lanigan (1948); Traxler (1950); Silvey (1951);
Merenda and Jackson (1969); Linn (1969)
The purpose of this study was to investigate the stability of
reading performance as measured by standardized tests at various intervals 
over the initial eleven grade levels. 

Method 

Reading tests from three popular achievement batteries were used
in the study: Metropolitan Achievement Tests (MAT), Iowa Tests of Basic
Skills (ITBS), and Iowa Tests of Educational Development (ITED). The
use of different achievement tests is both a strength and a weakness.
Varying the tests increases the generalizability of findings, showing that
the results are not limited to a particular measuring instrument. At the
same time, the nature of the variables being measured differs somewhat
among the tests; hence the stability estimates will be conservative in
nature.

Tests were administered annually in grades 1 through 7, and also in
grades 9 and 11. A reading vocabulary test is included in all three test
batteries at each grade level (in the MAT battery the test is called
Word Knowledge). A reading comprehension test is also included in all three
test batteries. In fact, three reading comprehension tests are included
in the ITED battery: Ability to Interpret Reading Materials in the Social
Studies, Ability to Interpret Reading Materials in the Natural Sciences,
and Ability to Interpret Literary Materials; hence the standard
scores on these three tests were pooled to obtain a composite reading score
for each student.
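
The pooling of the three ITED reading scores can be illustrated with a minimal sketch. It assumes that pooling means a simple unweighted average, as the note to Table 2 suggests ("average of three reading comprehension tests"); the function name and the example scores are hypothetical and are not taken from the study data.

```python
from statistics import mean


def composite_reading_score(standard_scores):
    """Pool several reading standard scores into one composite.

    Assumption: pooling is an unweighted average of the scores.
    """
    if not standard_scores:
        raise ValueError("at least one standard score is required")
    return mean(standard_scores)


# Hypothetical ITED standard scores for a single student.
scores = {
    "social_studies": 16,
    "natural_sciences": 14,
    "literary_materials": 18,
}
composite = composite_reading_score(list(scores.values()))
print(composite)
```

An unweighted average gives each of the three interpretation tests equal influence on the composite; a weighted scheme would need weights that the report does not supply.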

Three independent samples of students were included in the study: 

The means and standard deviations of the reading vocabulary and compre- 
hension scores are reported for each grade level in Table 2, along with 
sample sizes and dates of testing. The standard deviations and means are






Dr. K. D. Hopkins
Dr. G. H. Bracht


based on the scores of all students in a grade level. In order to determine
whether the degree of variability in the samples differed from the population
variability, the standard deviations of grade-equivalent scores were compared
with corresponding values for the norming population reported in the test
manuals. (MAT and ITED estimates were computed from $s_e$ and $r_{11}$:
$s = s_e / \sqrt{1 - r_{11}}$.)
A comparison of these values with the sample values revealed the sample had
greater variability in 18 of 39 instances and less variability in the remaining
21 comparisons. In most cases the differences were small. Hence the variability
in the samples appears to be quite representative.
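
The recovery of a norm-group standard deviation from a standard error of measurement and a reliability coefficient can be sketched as follows. The formula is the one given above, rearranged from the usual definition of the standard error of measurement; the function name and the numerical inputs are hypothetical and do not come from any test manual.

```python
import math


def sd_from_sem(sem, reliability):
    """Return s = s_e / sqrt(1 - r_11).

    sem         -- standard error of measurement (s_e)
    reliability -- reliability coefficient (r_11), with 0 <= r_11 < 1
    """
    if not 0.0 <= reliability < 1.0:
        raise ValueError("reliability must lie in [0, 1)")
    return sem / math.sqrt(1.0 - reliability)


# Hypothetical values: s_e = 0.3 grade-equivalent units, r_11 = .91.
estimated_sd = sd_from_sem(0.3, 0.91)
print(round(estimated_sd, 2))
```

The estimate grows rapidly as the reliability approaches 1.0, so small rounding differences in a reported reliability can noticeably change the implied standard deviation.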

The standard deviations were also computed for only the students present
at the most extreme grade levels to assess potential selection effects on the
variability within the samples. The average variability of this smaller sample
differed only slightly (about .03σ in grade 1 and .02σ in grade 11) from the
total sample within each grade level; hence the stability coefficients are not
unrepresentative as a consequence of atypical sample variability.

Achievement Stability 

The stability and generalizability coefficients for the reading vocabulary 
scores are reported in Table 3. Any student who was enrolled in any two or 
more of the grade levels during the specified years was included in the 
sample. The N on which the stability coefficient is based is given below the 
diagonal for each sample. For example, in Sample II the correlation between 
grade one and grade two reading vocabulary was .64, which was based on 415 
pairs of scores. A factor which must be kept in mind in the interpretation 
of the stability coefficients is the change in test battery at grades 4 and 
9. The slight but generally consistent decreases in stability at grade
4 are probably related to the change in test battery rather than a real change

Table 2

Grade Levels, Sample Sizes, Means and Standard Deviations, and Date of
Testing for Samples I, II, and III.

                         Vocabulary^a   Comprehension^b
                                                        Date of   Test and
SAMPLE Grade      N      X^c      s      X^c      s     Testing   Form

I          3    461    4.64   1.37    4.79   1.49   3/60    MAT, Elem. A
           4    452    4.92   1.31    5.08   1.66   9/60    ITBS, 1
           5     71    6.36   1.57    6.32   1.79   9/61    ITBS, 2
           6   1024    7.37   1.73    7.31   1.53   9/62    ITBS, 1
           7   1065    8.56   1.62    8.39   1.58   10/63   ITBS, 2
           9   1116   16.36   5.52   15.33   5.30   11/65   ITED, X4
          11    919   19.83   5.42   18.77   5.83   12/67   ITED, Y4

II         1    540    2.10    .50    2.12    .62   3/60    MAT, Prim. I, A
           2    520    3.35    .91    3.63    .85   3/61    MAT, Prim. II, B
           3   1115    4.65   1.27    4.72   1.38   3/62    MAT, Elem. B
           4   1115    4.92   1.14    5.07   1.52   9/62    ITBS, 1
           5   1185    6.30   1.51    6.28   1.76   10/63   ITBS, 2
           6   1195    6.91   1.48    6.90   1.47   10/64   ITBS, 3
           7   1240    8.14   1.57    7.81   1.56   12/65   ITBS, 4
           9   1050   16.06   5.22   15.30   5.30   12/67   ITED, X4

III        1   1320    2.18    .53    2.19    .65   3/63    MAT, Prim. I, A
           2   1250    3.45    .92    3.63    .90   3/64    MAT, Prim. II, B
           3   1275    4.80   1.25    4.88   1.43   3/65    MAT, Elem. A
           4   1315    4.77   1.11    4.71   1.28   10/65   ITBS, 1
           5   1140    5.84   1.25    5.97   1.38   9/66    ITBS, 2
           6   1095    6.92   1.44    6.98   1.47   9/67    ITBS, 3

a "Word Knowledge" test on the MAT.

b "Reading" tests of the MAT and ITBS; average of three reading comprehension
tests of the ITED.

c Grade equivalent units on the MAT and ITBS; standard scores on the ITED.
Table 3

Stability Coefficients (Above Diagonals) and Corresponding Ns (Below Diagonals)
for Reading Vocabulary Scores for Samples I, II, and III.

TESTS AND GRADE LEVEL

                    MAT                  ITBS                ITED
Grade  Test      1     2     3     4     5     6     7     9    11  r_tt^a

Sample I
    3  MAT                  --   .82   .86   .81   .79   .79   .76     .95
    4  ITBS                388    --   .80   .80   .76   .78   .77     .91
    5  ITBS                368   394    --   .89   .87   .85   .83     .91
    6  ITBS                352   373   776    --   .88   .87   .85     .91
    7  ITBS                341   362   707   891    --   .88   .87     .89
    9  ITED                314   324   611   751   836    --   .91     .93
   11  ITED                285   281   532   642   697   878    --     .95

Sample II
    1  MAT      --   .64   .56   .56   .52   .55   .56   .51           .79
    2  MAT     415    --   .78   .72   .65   .66   .66   .59           .92
    3  MAT     400   435    --   .80   .79   .76   .74   .67           .95
    4  ITBS    375   400  975c    --   .82   .79  .76b   .71           .88
    5  ITBS    355   375   885   990    --   .85   .83   .81           .91
    6  ITBS    345   360   825   910  1030    --   .85   .81           .88
    7  ITBS    320   335   765   840   930  1040    --   .83           .88
    9  ITED    300   305   685   740   795   870  1005    --           .93

Sample III
    1  MAT      --   .70   .64   .55   .58   .56                       .82
    2  MAT    1000    --   .80   .70   .69   .65                       .93
    3  MAT     885   995    --   .78   .78   .76                       .95
    4  ITBS    815   895  1090    --   .80   .76                       .87
    5  ITBS    730   790   940  1070    --   .83                       .86
    6  ITBS    720   770   915  1025  1025    --                       .88

a From test manuals, adapted to the variability of the sample using formula
given by Guilford (1954, p. 392).

b Corresponding value from Merenda and Jackson (1969) was also .76.

c There was an increase in sample size at this grade level due to school
district reorganization.
in the stability of reading vocabulary. The decreases are slight, however,
indicating that the stability is not limited to intra-battery inferences but 
is quite general across competing achievement batteries. 
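
The inclusion rule described above, in which any student tested at two or more grade levels contributes to every coefficient for which both scores exist, amounts to computing each correlation on the pairs available for that pair of grades. The following is a minimal sketch of that rule using only the standard library; the student identifiers and scores are hypothetical, and the authors' actual computing procedure is not described in the report.

```python
import math


def pairwise_stability(scores_a, scores_b):
    """Pearson r between two testings, using only students present in both.

    scores_a, scores_b -- dicts mapping student id -> score.
    Returns (r, n), where n is the number of paired scores.
    """
    common = sorted(set(scores_a) & set(scores_b))
    n = len(common)
    if n < 2:
        raise ValueError("need at least two paired scores")
    xs = [scores_a[s] for s in common]
    ys = [scores_b[s] for s in common]
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary within each testing")
    return sxy / math.sqrt(sxx * syy), n


# Hypothetical grade-equivalent scores; s1 and s5 were each tested only once.
grade1 = {"s1": 1.8, "s2": 2.1, "s3": 2.6, "s4": 2.0}
grade2 = {"s2": 3.0, "s3": 4.1, "s4": 3.2, "s5": 3.9}
r, n = pairwise_stability(grade1, grade2)
print(n, round(r, 2))
```

Because each coefficient rests on its own set of pairs, the Ns differ from cell to cell, which is why Table 3 reports an N for every coefficient.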

Reading vocabulary scores were moderately stable from grade 1
through grade 3 (r = .6), which indicates considerable stability in pupils'
scores on the vocabulary tests even at this early stage. The individual
differences in pupils' reading vocabulary at grade one represent a definite
lasting characteristic for the group. The correlation of .51 between
vocabulary scores at grades one and nine indicates that, on the average, a
pupil tended to be about half as far from the grade nine mean as he was from
the grade one mean (in standard deviation units). Notice
that the stability coefficients tend to increase as grade level increases.
Notice also that there is little loss in stability after the initial two
or three years. The stability of reading vocabulary scores in grade two
was considerably greater than that reflected in grade one performance,
correlating about .6 with scores seven years later. Beginning at grade 3,
the stability of reading vocabulary achievement was maintained at a high
level through the secondary grades with an eight-year stability of .76 in
Sample I. The stability of reading vocabulary achievement was extremely
high for all groups by the beginning of grade five, with the stability
coefficients approaching the tests' reliabilities. The generally higher
stability coefficients in Sample I can be explained partially by the
slightly greater variability of scores (cf. Table 2).
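
The "half as far from the mean" interpretation of the .51 coefficient follows from the regression of later standard scores on earlier ones: with both testings expressed as z-scores, the predicted later z is the earlier z multiplied by r. The sketch below illustrates this; the function name and the example pupil are hypothetical.

```python
def predicted_later_z(earlier_z, stability_r):
    """Predicted later z-score from an earlier z-score.

    Assumes linear regression with both sets of scores standardized,
    in which case the regression slope equals the correlation.
    """
    if not -1.0 <= stability_r <= 1.0:
        raise ValueError("a correlation must lie in [-1, 1]")
    return stability_r * earlier_z


# A hypothetical pupil two standard deviations above the grade one mean,
# with the grade one to grade nine coefficient of .51 from Table 3.
z_grade9 = predicted_later_z(2.0, 0.51)
print(round(z_grade9, 2))
```

The prediction describes the average for pupils with a given early score; individual pupils will scatter around it.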

Since the correlation coefficients in Table 3 reflect true change
and stability of reading vocabulary plus errors of measurement, the
coefficients were corrected for attenuation to provide an estimate of the
stability of true scores in reading vocabulary. These disattenuated
stability coefficients are given in Table 4.
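
The correction for attenuation can be sketched as follows, assuming the conventional formula in which the observed correlation is divided by the square root of the product of the two reliabilities. The report does not print the formula, so its use here is an assumption; the function name is hypothetical. The inputs are the Sample I grade 3 and grade 4 values from Table 3 (r = .82, reliabilities .95 and .91).

```python
import math


def disattenuate(r_xy, rel_x, rel_y):
    """Estimate the true-score correlation: r_xy / sqrt(rel_x * rel_y).

    Assumption: the conventional correction for attenuation applies.
    """
    if not (0.0 < rel_x <= 1.0 and 0.0 < rel_y <= 1.0):
        raise ValueError("reliabilities must lie in (0, 1]")
    return r_xy / math.sqrt(rel_x * rel_y)


# Sample I, grade 3 (MAT) with grade 4 (ITBS), values from Table 3.
true_score_r = disattenuate(0.82, 0.95, 0.91)
print(round(true_score_r, 2))
```

Under this assumption the result rounds to about .88, in line with the corresponding entry for Sample I in Table 4. With reliabilities near .9 the correction raises a coefficient by roughly .05 to .10.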

The values in Table 4 provide estimates of the degree of relationship
between reading vocabulary performance free from the contaminating effects
of errors of measurement, and hence address the theoretical issue of
true stability better than the corresponding values found in Table 3. If
the reliability of the MAT and ITBS reading vocabulary tests were to be
increased to 1.0, the correlation of grade one scores with grade six scores
would be expected to be .71, and .65 with grade nine scores, reflecting
substantial long-range implications of initial reading success. True reading
vocabulary scores near the end of the primary cycle (2.7) were very highly
related (r's = .83 - .88) to true scores in grade six and grade eleven
(r = .80). The rank order of pupils' true reading vocabularies changes very
little after four years of formal reading instruction (i.e., after 5.1), with
disattenuated correlations with all measures thereafter approaching .9 or
higher.

It can be concluded that reading vocabulary near the end of grade 
one gives a good indication of the reading vocabulary of pupils ten years 
later; the indication is excellent after the completion of grade four. 

Reading Comprehension

The stability coefficients for reading comprehension tests are given in
Table 5. Since three different standardized tests were employed, each one
operationally defining reading somewhat differently, the coefficients must be
viewed as conservative estimates. They are, however, generalizability
coefficients (Cronbach, Rajaratnam, and Gleser, 1963) which have allowed both
time and test battery to vary. The stability coefficients for reading comprehension
Table 4

Disattenuated Stability Coefficients for Reading Vocabulary for Samples I, II,
and III.

Grade  Test      1     2     3     4     5     6     7     9    11

Sample I
    3  MAT                  --   .88   .92   .87   .84   .83   .80
    4  ITBS                       --   .88   .88   .84   .85   .83
    5  ITBS                             --   .98   .97   .92   .89
    6  ITBS                                   --   .98   .95   .91
    7  ITBS                                         --   .97   .95
    9  ITED                                               --   .97

Sample II
    1  MAT      --   .75   .65   .67   .61   .66   .67   .60
    2  MAT            --   .83   .80   .71   .73   .73   .63
    3  MAT                  --   .88   .85   .83   .81   .71
    4  ITBS                       --   .92   .90   .86   .78
    5  ITBS                             --   .95   .93   .88
    6  ITBS                                   --   .97   .89
    7  ITBS                                         --   .91

Sample III
    1  MAT      --   .80   .73   .65   .69   .66
    2  MAT            --   .85   .78   .77   .72
    3  MAT                  --   .86   .86   .83
    4  ITBS                       --   .93   .87
    5  ITBS                             --   .95
Table 5

Stability Coefficients of Reading Comprehension^a Before (Above Diagonals) and
After (Below Diagonals) Correction for Attenuation Scores for Samples I, II,
and III.

TESTS AND GRADE LEVEL

                    MAT                  ITBS                ITED
Grade  Test      1     2     3     4     5     6     7     9    11    r_11

Sample I
    3  MAT                  --   .82   .79   .80   .77   .76   .72     .96
    4  ITBS                .86    --   .79   .78   .78   .77   .73     .96
    5  ITBS                .82   .82    --   .85   .82   .79   .78     .96
    6  ITBS                .83   .81   .89    --   .87   .85   .83     .91
    7  ITBS                .82   .83   .85   .91    --   .85   .85     .92
    9  ITED                .80   .82   .84   .90   .9?    --   .87     .93
   11  ITED                .76   .77   .82   .87   .91   [?]    --     .95

Sample II
    1  MAT      --   .59   .58   .59   .57   .50   .53   .53           .81
    2  MAT     .68    --   .71   .66   .65   .65   .62   .58           .90
    3  MAT     .66   .76    --   .77   .76   .75   .70   .66           .96
    4  ITBS    .67   .71   .81    --   .83   .79  .74b   .71           .95
    5  ITBS    .65   .70   .79   .87    --   .83   .78   .78           .96
    6  ITBS    .58   .72   .80   .85   .89    --   .82   .80           .91
    7  ITBS    .62   .68   .75   .79   .82   .90    --   .79           .92
    9  ITED    .61   .63   .70   .76   .82   .87   .85    --           .95

Sample III
    1  MAT      --   .64   .62   .61   .61   .59                       .83
    2  MAT     .73    --   .72   .69   .71   .65                       .92
    3  MAT     .69   .77    --   .76   .77   .73                       .96
    4  ITBS    .69   .75   .80    --   .81   .76                       .93
    5  ITBS    .69   .77   .81   .87    --   .84                       .93
    6  ITBS    .67   .71   .78   .83   .91    --                       .91

a Actual test titles: "Reading" for MAT and ITBS, and the average of three
reading interpretation tests (tests 5-7) on the ITED.

b Corresponding value from Merenda and Jackson (1969) was .77.
are very similar to corresponding values for
vocabulary given in Table 3, the mean and mode vocabulary stability
coefficients being .02 and .01 larger than corresponding comprehension values.

The disattenuated stability coefficients are given below the diagonal
for each sample in Table 5. These values averaged about .05 less than
corresponding values for vocabulary, indicating that there is less true-score
stability in comprehension than in vocabulary, although both are very
stable after grade three. (The stability coefficients in Tables 3 and 5
agree very closely with those from Linn (1969) (5) and Merenda and Jackson
(1969) (6), who studied the grade 4-7 and 5-8 intervals, respectively,
using the ITBS.)

Summary

The stability of reading vocabulary and comprehension was studied
over the grade one to grade eleven interval using three large samples of
students. Substantial long-term stability was reflected in both types of
tests; grade one scores correlated above .5 with all subsequent measures.
By the end of the primary grades, students' scores correlated above .70
with all subsequent measures. When the coefficients were corrected
for attenuation to allow an estimate of the relationships after errors
of measurement on the test were removed, the values were about .10 higher.
Early performance in reading does not represent temporary maturational
status for most pupils, but has a substantial relationship with terminal
achievement levels in both reading vocabulary and comprehension.